Intrinsic fault tolerance of multilevel Monte Carlo methods

Authors

  • Stefan Pauli
  • Peter Arbenz
  • Christoph Schwab
Abstract

Monte Carlo (MC) and multilevel Monte Carlo (MLMC) methods applied to solvers for partial differential equations with random input data are proved to exhibit intrinsic failure resilience. Sufficient conditions are provided under which the non-recoverable loss of a random fraction of MC samples does not fatally damage the asymptotic accuracy vs. work of an MC simulation. Specifically, the convergence behavior of MLMC methods on massively parallel hardware with runtime faults is analyzed mathematically and investigated computationally. Our mathematical model assumes node failures that occur uncorrelated with MC sampling and with general sample failure statistics on the different levels. It also assumes the absence of checkpointing, i.e., we assume irrecoverable sample failures with complete loss of data. Modifications of the MLMC method with enhanced resilience are proposed. The theoretical results are obtained under general statistical models of CPU failure at runtime. Particular attention is paid to node failures with so-called Weibull failure models. We discuss the resilience of massively parallel stochastic Finite Volume computational fluid dynamics simulations.
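
The setting described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and makes several assumptions that do not come from the paper: the "solver" is a toy function whose bias halves with each level, the runtime of a sample on level l is taken to be 2^l, a sample is lost when a Weibull-distributed time to failure is shorter than that runtime, and the names `approx`, `level_sample`, `sample_survives` and `ft_mlmc` are hypothetical. Lost samples are discarded without recovery, and each level average is taken over the survivors only.

```python
import random


def approx(z, level):
    # Toy stand-in for a PDE solver (assumption): the exact quantity is z**2
    # and the relative discretization error halves with each level.
    return z * z * (1.0 + 2.0 ** (-level))


def level_sample(level, rng):
    # One sample of P_l - P_{l-1}, using the same random input on both levels.
    z = rng.gauss(0.0, 1.0)
    fine = approx(z, level)
    if level == 0:
        return fine
    return fine - approx(z, level - 1)


def sample_survives(level, rng, scale=50.0, shape=0.7):
    # Assumed failure model: the sample is lost if the node fails before the
    # sample finishes. The time to failure is Weibull distributed and drawn
    # independently of the sample value.
    runtime = 2.0 ** level  # hypothetical cost model
    time_to_failure = rng.weibullvariate(scale, shape)
    return time_to_failure >= runtime


def ft_mlmc(samples_per_level, rng):
    # Sum over levels of the mean of the surviving samples.
    estimate = 0.0
    for level, n_samples in enumerate(samples_per_level):
        survivors = [
            level_sample(level, rng)
            for _ in range(n_samples)
            if sample_survives(level, rng)
        ]
        if not survivors:
            # Design choice of this sketch only: give up if a level is empty.
            raise RuntimeError(f"all samples lost on level {level}")
        estimate += sum(survivors) / len(survivors)
    return estimate
```

Calling `ft_mlmc([20000, 2000, 1000, 500, 250], random.Random(0))` estimates the expectation of the finest-level approximation, which for this toy problem is 1 + 2^-4. Because failures are drawn independently of the sample values, discarding lost samples reduces the number of samples per level but does not bias the level averages.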

Similar resources

Stochastic Assessment of Voltage Sags in Distribution Networks

This paper compares the fault position and Monte Carlo methods, the most common methods for stochastic assessment of voltage sags. To compare their abilities, symmetrical and unsymmetrical faults with different probability distributions of fault positions along the lines are applied in a test system. The voltage sag magnitude at different nodes of the test system is calculated. The problem with the...

Grid computing of Monte Carlo based American option pricing: Analysis of two methods

This paper aims to provide an overview and a performance comparison of some parallel and distributed algorithms for Bermudan-American option pricing. We use two Monte Carlo based methods to address such pricing in the case of a high number of assets (high dimension) through continuation values classification and optimal exercise boundary computation. Our implementations are supported by a Java...

Reliability and Fault Tolerance of Ultra Low Voltage High Speed Differential CMOS

The reliability and fault tolerance of the differential ultra low voltage gate are elaborated in this paper. The gate's optimal yield and defect tolerance, compared with the ULV gate and standard CMOS, are given. The results are obtained through Monte Carlo simulations.

A fault tolerant implementation of Multi-Level Monte Carlo methods

The theory behind fault tolerant multi-level Monte Carlo (FT-MLMC) methods was recently developed and tested. These tests were made without a real fault tolerant implementation. We implemented an MPI-parallelized fault tolerant MLMC version of an existing parallel MLMC code (ALSVID-UQ). It is based on User Level Failure Mitigation, a fault tolerant extension of MPI. We confirm our FT-MLMC t...

Probabilistically-induced domain decomposition methods for elliptic boundary-value problems

Monte Carlo as well as quasi-Monte Carlo methods are used to generate only a few interfacial values in two-dimensional domains where elliptic boundary-value problems are formulated. This allows for a decomposition of the domain. A continuous approximation of the solution is obtained by interpolating on such interfaces, and is then used as boundary data to split the original problem into fully de...

Journal:
  • J. Parallel Distrib. Comput.

Volume 84

Publication date: 2015